Search CORE

11,540 research outputs found

Image Aesthetics Assessment Using Composite Features from off-the-Shelf Deep Models

Author: Fan Cien
Fu Xin
Yan Jia
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 22/02/2019
Field of study

Deep convolutional neural networks have recently achieved great success on image aesthetics assessment task. In this paper, we propose an efficient method which takes the global, local and scene-aware information of images into consideration and exploits the composite features extracted from corresponding pretrained deep learning models to classify the derived features with support vector machine. Contrary to popular methods that require fine-tuning or training a new model from scratch, our training-free method directly takes the deep features generated by off-the-shelf models for image classification and scene recognition. Also, we analyzed the factors that could influence the performance from two aspects: the architecture of the deep neural network and the contribution of local and scene-aware information. It turns out that deep residual network could produce more aesthetics-aware image representation and composite features lead to the improvement of overall performance. Experiments on common large-scale aesthetics assessment benchmarks demonstrate that our method outperforms the state-of-the-art results in photo aesthetics assessment.Comment: Accepted by ICIP 201

arXiv.org e-Print Archive

Crossref

Diversified Top-k Graph Pattern Matching

Author: Fan Wenfei
Wang Xin
Wu Yinghui
Publication venue
Publication date: 01/01/2013
Field of study

Edinburgh Research Explorer

Performance Guarantees for Distributed Reachability Queries

Author: Fan Wenfei
Wang Xin
Wu Yinghui
Publication venue
Publication date: 01/01/2012
Field of study

In the real world a graph is often fragmented and distributed across different sites. This highlights the need for evaluating queries on distributed graphs. This paper proposes distributed evaluation algorithms for three classes of queries: reachability for determining whether one node can reach another, bounded reachability for deciding whether there exists a path of a bounded length between a pair of nodes, and regular reachability for checking whether there exists a path connecting two nodes such that the node labels on the path form a string in a given regular expression. We develop these algorithms based on partial evaluation, to explore parallel computation. When evaluating a query Q on a distributed graph G, we show that these algorithms possess the following performance guarantees, no matter how G is fragmented and distributed: (1) each site is visited only once; (2) the total network traffic is determined by the size of Q and the fragmentation of G, independent of the size of G; and (3) the response time is decided by the largest fragment of G rather than the entire G. In addition, we show that these algorithms can be readily implemented in the MapReduce framework. Using synthetic and real-life data, we experimentally verify that these algorithms are scalable on large graphs, regardless of how the graphs are distributed.Comment: VLDB201

arXiv.org e-Print Archive

Edinburgh Research Explorer

Feature Augmentation via Nonparametrics and Selection (FANS) in High Dimensional Classification

Author: Fan Jianqing
Feng Yang
Jiang Jiancheng
Tong Xin
Publication venue
Publication date: 02/01/2015
Field of study

We propose a high dimensional classification method that involves nonparametric feature augmentation. Knowing that marginal density ratios are the most powerful univariate classifiers, we use the ratio estimates to transform the original feature measurements. Subsequently, penalized logistic regression is invoked, taking as input the newly transformed or augmented features. This procedure trains models equipped with local complexity and global simplicity, thereby avoiding the curse of dimensionality while creating a flexible nonlinear decision boundary. The resulting method is called Feature Augmentation via Nonparametrics and Selection (FANS). We motivate FANS by generalizing the Naive Bayes model, writing the log ratio of joint densities as a linear combination of those of marginal densities. It is related to generalized additive models, but has better interpretability and computability. Risk bounds are developed for FANS. In numerical analysis, FANS is compared with competing methods, so as to provide a guideline on its best application domain. Real data analysis demonstrates that FANS performs very competitively on benchmark email spam and gene expression data sets. Moreover, FANS is implemented by an extremely fast algorithm through parallel computing.Comment: 30 pages, 2 figure

arXiv.org e-Print Archive

Princeton University Open Access Repository

Are Discoveries Spurious? Distributions of Maximum Spurious Correlations and Their Applications

Author: Fan Jianqing
Shao Qi-Man
Zhou Wen-Xin
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 21/07/2017
Field of study

Over the last two decades, many exciting variable selection methods have been developed for finding a small group of covariates that are associated with the response from a large pool. Can the discoveries from these data mining approaches be spurious due to high dimensionality and limited sample size? Can our fundamental assumptions about the exogeneity of the covariates needed for such variable selection be validated with the data? To answer these questions, we need to derive the distributions of the maximum spurious correlations given a certain number of predictors, namely, the distribution of the correlation of a response variable

Y

with the best

s

linear combinations of

p

covariates

\mathbf{X}

, even when

\mathbf{X}

and

Y

are independent. When the covariance matrix of

\mathbf{X}

possesses the restricted eigenvalue property, we derive such distributions for both a finite

s

and a diverging

s

, using Gaussian approximation and empirical process techniques. However, such a distribution depends on the unknown covariance matrix of

\mathbf{X}

. Hence, we use the multiplier bootstrap procedure to approximate the unknown distributions and establish the consistency of such a simple bootstrap approach. The results are further extended to the situation where the residuals are from regularized fits. Our approach is then used to construct the upper confidence limit for the maximum spurious correlation and to test the exogeneity of the covariates. The former provides a baseline for guarding against false discoveries and the latter tests whether our fundamental assumptions for high-dimensional model selection are statistically valid. Our techniques and results are illustrated with both numerical examples and real data analysis

arXiv.org e-Print Archive

Princeton University Open Access Repository

eScholarship - University of California